Centering Theory in natural text: a large-scale corpus study
Abstract
We present an extensive corpus study of Centering Theory (CT), examining how adequately CT models coherence in a large body of natural text. A novel analysis of transition bigrams provides strong empirical support for several CT-related linguistic claims which so far have been investigated only on various small data sets. The study also reveals genre-based differences in texts’ degrees of entity coherence. Previous work has shown unsupervised CT-based coherence metrics to be unable to outperform a simple baseline. We identify two reasons: 1) these metrics assume that some transition types are more coherent and that they occur more frequently than others, but in our corpus the latter is not the case; and 2) the original sentence order of a document and a random permutation of its sentences differ mostly in the fraction of entity-sharing sentence pairs, exactly the factor measured by the baseline.
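As a rough illustration of the baseline factor named in the abstract, the sketch below computes the fraction of adjacent sentence pairs that share at least one entity, once for a document in its original order and once for a random permutation. It is only a minimal sketch under stated assumptions: the function name `entity_sharing_fraction`, the toy document, and the representation of each sentence as a set of entity strings are all hypothetical and are not taken from the paper, which may extract and match entities differently.

```python
import random
from typing import List, Set


def entity_sharing_fraction(sentences: List[Set[str]]) -> float:
    """Fraction of adjacent sentence pairs that share at least one entity.

    Hypothetical helper: each sentence is assumed to be pre-reduced to
    the set of entities it mentions.
    """
    if len(sentences) < 2:
        return 0.0  # no adjacent pairs to compare
    pairs = list(zip(sentences, sentences[1:]))
    shared = sum(1 for left, right in pairs if left & right)
    return shared / len(pairs)


# Toy document (invented for illustration), one entity set per sentence.
doc = [
    {"john", "store"},
    {"john", "milk"},
    {"milk", "fridge"},
    {"fridge"},
    {"mary"},
]

original_score = entity_sharing_fraction(doc)

# Score a random permutation of the same sentences (seeded for repeatability).
rng = random.Random(0)
shuffled = doc[:]
rng.shuffle(shuffled)
shuffled_score = entity_sharing_fraction(shuffled)

print(f"original order: {original_score:.2f}")
print(f"random permutation: {shuffled_score:.2f}")
```

In this toy document three of the four adjacent pairs share an entity in the original order, so `original_score` is 0.75; the score of a permutation depends on which sentences end up next to each other.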
Related resources
Evaluating Centering-Based Metrics of Coherence
We use a reliably annotated corpus to compare metrics of coherence based on Centering Theory with respect to their potential usefulness for text structuring in natural language generation. Previous corpus-based evaluations of the coherence of text according to Centering did not compare the coherence of the chosen text structure with that of the possible alternatives. A corpus-based methodology i...
Zero Pronoun Resolution in Thai: A Centering Approach
Since pronouns can be dropped in Thai, a natural language processing system for Thai must be able to resolve referents of the missing pronouns. One of several approaches that have been used for reference resolution is Centering Theory. Centering Theory is a focusing process in which salience of discourse entities is being kept track of. Referents of pronouns or zero pronouns are usually entitie...
Evaluating Centering for Information Ordering Using Corpora
In this article we discuss several metrics of coherence defined using centering theory and investigate the usefulness of such metrics for information ordering in automatic text generation. We estimate empirically which is the most promising metric and how useful this metric is using a general methodology applied on several corpora. Our main result is that the simplest metric (which relies exclu...
Discourse and Coherence: Revisiting Specific Conventions of the Centering Theory
This paper discusses a corpus-based study whose aim is to evaluate specific conventions of the centering theory and to establish whether they should be revisited. In particular, the study explores the relation between discourse coherence and several parameters such as the definition of an utterance, the varieties of anaphora considered, the forms of the discourse entities and the type of genre.
A Centering Dynamics Approach to Zero Pronouns in Korean*
Kim, Mi-Kyung. 2003. A Centering Dynamics Approach to Zero Pronouns in Korean. Discourse and Cognition 10.3, 57-73. This study examines 249 utterances from a corpus of newspaper articles, focusing on the distribution of zero subjects and zero objects. Two kinds of approach, centering theory in Grosz et al. (1995) and centering dynamics in Kameyama (1996, 1998), are compared to find out which ap...
Publication date: 2014